Cryptographic device and associated methods

ABSTRACT

A cryptographic device includes an input stage receiving an input data block and a key data block made up of a plurality of sub-key data blocks, and generating a plurality of first signals therefrom. An intermediate stage is connected to the input stage and includes a plurality of substitution units. Each substitution unit substitutes data within a respective first signal. A diffuser is connected to the plurality of substitution units for mixing data to generate a diffused signal. An output stage is connected to the intermediate stage for repetitively looping back the diffused signal to the input stage for combination with a next sub-key data block. The output stage provides an output signal for the cryptographic device after the repetitively looping back is complete.

GOVERNMENT LICENSE RIGHTS

The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of contract No. MDA904-99-C-6511, awarded by the U.S. Government.

FIELD OF THE INVENTION

The present invention relates to the field of cryptography, and more particularly, to cryptographic algorithms, such as the advanced encryption standard (AES) algorithm.

BACKGROUND OF THE INVENTION

The new advanced encryption standard (AES) algorithm was designed for commercial applications. As per its original specification, the AES algorithm runs efficiently in both hardware and software. This requirement not only increases the probability of a software-based brute force success, but also imposes a mathematical structure that may shrink the search space of the attack.

AES was formerly known as Rijndael and is the U.S. Government's new type-3 algorithm. The algorithm is cryptographically strong, and is efficient in both hardware and software embodiments. The algorithm features a scalable key length. This attribute, together with other advantages, ensures that the AES algorithm will be effective for commercial applications for many years.

A complete description of the AES algorithm can be found in the Federal Information Processing Standard (FIPS-197). For the AES algorithm to run efficiently in software, it was constructed with multiple rounds (or loops) of arithmetic functions that operate on byte-sized or 32 bit word-sized variables over GF(2⁸) and GF(2). Operations over these two Galois fields result in algorithm behavior that is highly non-linear. Each round includes a substitution operation, a row shift operation, a column mixing operation, and a sub-key variable addition operation.

The substitution operation includes a non-linear byte substitution that operates independently on each byte of the 128 bit input to this stage. The substitution includes a series of linear operations over both GF(2⁸) and GF(2). The relationship between operations over these two fields results in an overall mapping that is non-linear. This property provides strength against linear and differential cryptanalysis.

Following the substitution operation is a straightforward row-wise byte shifting operation. Next, a column-wise polynomial transformation is applied if a count indicates that the total number of rounds has not been completed yet. The transformation includes a third order polynomial multiplication over GF(2⁸). The row shifting and column transformation operations provide mixing or diffusion layers to the algorithm. Finally, a key variable addition operation is performed. This operation is a straightforward modulo-two addition of an input variable with the appropriate sub-key variable over GF(2).

The row shifting and column transformation operations provide layers of diffusion for the AES algorithm. However, there is no redundancy in the AES diffusion layers. The algorithm's cryptographic strength depends on the mixing offered by the diffusion layer operations working together over multiple rounds. If an attack is identified that eliminates the contribution of any of these operations, the overall cryptographic strength of the algorithm will be compromised.

SUMMARY OF THE INVENTION

In view of the foregoing background, it is therefore an object of the present invention to enhance the cryptographic strength of a cryptographic algorithm, such as the advanced encryption standard (AES) algorithm, for example.

This and other objects, features and advantages in accordance with the present invention are provided by a cryptographic device comprising an input stage, an intermediate stage and an output stage. The input stage may receive an input data block and a key data block comprising a plurality of sub-key data blocks, and generate a plurality of first signals therefrom. The intermediate stage may be connected to the input stage and may comprise a plurality of substitution units, with each substitution unit substituting data within a respective first signal. A diffuser may be connected to the plurality of substitution units for mixing data to generate a diffused signal.

The output stage may be connected to the intermediate stage for repetitively looping back the diffused signal to the input stage for combination with a next sub-key data block. The output stage may provide an output signal for the cryptographic device after the repetitively looping back is complete. The output signal may be further combined with a final sub-key data block.

The cryptographic strength of the algorithm is advantageously increased because of the added diffusion layer associated with the plurality of substitution units and the diffuser connected thereto. Each substitution unit may perform a non-linear substitution based upon a look-up table. The diffuser may comprise a shift register and a look-up table associated therewith for mixing the data. Alternatively, the diffuser may comprise a plurality of shift registers and a plurality of look-up tables associated therewith for mixing the data.

The output stage may also perform a row-shift operation on the diffused output signal before being looped back to the input stage. Likewise, the output stage may also perform a column-mix operation on the diffused output signal being looped back to the input stage. The output stage may further comprise a counter for counting a number of times the diffused output signal is looped back to the input stage.

Another aspect of the present invention is directed to a communication system comprising a key scheduler providing the key data block, and a cryptographic device connected thereto as defined above. The key scheduler and cryptographic device may be formed as part of an application specific integrated circuit (ASIC), for example.

Yet another aspect of the present invention is directed to a method for converting an input data block into an output signal for a cryptographic device. The method may comprise generating a plurality of first signals based upon the input data block and a key data block comprising a plurality of sub-key data blocks, and substituting data within each first signal using a respective substitution unit. The method may further comprise mixing data to generate a diffused signal using a diffuser connected to the respective substitution units, and repetitively looping back the diffused signal for combination with a next sub-key data block before repeating the substituting and mixing. An output signal for the cryptographic device is provided after the repetitively looping back is complete.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a communication system including a key scheduler and a cryptographic device in accordance with the present invention.

FIG. 2 is a detailed block diagram of the cryptographic device as shown in FIG. 1.

FIG. 3 is a flowchart of the cryptographic algorithm being executed by the cryptographic device as shown in FIG. 1.

FIG. 4 is one embodiment of the diffuser as shown in FIG. 2.

FIG. 5 is a second embodiment of the diffuser as shown in FIG. 2.

FIG. 6 is a third embodiment of the diffuser as shown in FIG. 2.

FIG. 7 is a flow diagram of a method for converting an input data block into an output data block for a cryptographic device in accordance with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout, and prime and double prime notations are used to indicate similar elements in alternative embodiments.

Referring initially to FIG. 1, a secure communication system 10 comprises a key scheduler 12, a cryptographic device 14 and a transceiver 16. The key scheduler 12 provides a key data block to the cryptographic device 14. The cryptographic device 14 provides an output signal Y to the transceiver 16 based upon a received input data block X and the key data block from the key scheduler 12. The key scheduler 12 and the cryptographic device 14 may be formed as separate units, or they may be part of an application specific integrated circuit (ASIC), for example.

The cryptographic device 14 executes an algorithm that has enhanced cryptographic strength so that it is suitable for secure communication systems. The algorithm may be a modified advanced encryption standard (AES) algorithm, for example.

The algorithm in accordance with the present invention will now be discussed with reference to the cryptographic device 14 as shown in FIG. 2, and to a generalized flow diagram 15 as shown in FIG. 3. The cryptographic device 14 may comprise an input stage 22, an intermediate stage 24 and an output stage 26.

The input stage 22 receives the input data block X (Block 100), and the key data block (Block 102) from the key scheduler 12. The key data block comprises a plurality of sub-key data blocks as readily understood by those skilled in the art. Each sub-key data block may be referred to as a round key. In the case of the AES algorithm, for a 128 bit key data block there are 10 sub-key data blocks, for a 192 bit key data block there are 12 sub-key data blocks, and for a 256 bit key data block there are 14 sub-key data blocks as shown in Block 102.

In the input stage 22, the input data block X and a first sub-key data block are added together using a modulo-two unit 23 (Block 104). In accordance with the present invention, the output signal from the modulo-two unit 23 (Block 104) is divided into a plurality of first signals 25 a-25 n. For example, if the input data block X has a length of 128 bits, the output signal may be divided into 8 bit (1 byte) lengths resulting in n being equal to 16. The first signals may also be generally represented by reference 25. Of course the size of the input data block X and the number n of first signals may vary depending on the intended application.
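The input stage just described can be illustrated with a minimal Python sketch. It assumes a 128 bit block held as 16 bytes, so that the modulo-two addition is a byte-wise XOR and each resulting byte is one first signal. The function name `make_first_signals` and the sample values are hypothetical and are not taken from the specification.

```python
from typing import List


def make_first_signals(block: bytes, sub_key: bytes) -> List[int]:
    """Modulo-two add a block and a sub-key, then split the result into bytes.

    Each returned integer (0..255) stands for one 8 bit first signal.
    """
    if len(block) != len(sub_key):
        raise ValueError("block and sub-key must have the same length")
    # XOR is addition modulo two applied bit by bit.
    return [b ^ k for b, k in zip(block, sub_key)]


# Illustrative values only: a 128 bit block and a 128 bit sub-key.
x = bytes(range(16))
k0 = bytes([0xFF] * 16)
signals = make_first_signals(x, k0)
```

With a 128 bit block this yields n = 16 first signals, matching the example in the text; other block sizes would simply give a different n.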

The intermediate stage 24 is connected to the input stage 22 and comprises a plurality of substitution units 27 a-27 n (Blocks 106 a-106 n). The substitution units may also be generally represented by reference 27. There is a respective substitution unit 27 for each first signal 25. The intermediate stage 24 may also comprise a diffuser 30 (Block 108) connected to the plurality of substitution units 27 a-27 n for mixing data to generate a diffused signal. In the case of the AES algorithm, the plurality of substitution units 27 a-27 n and the diffuser 30 add redundancy to the existing diffusion layers. In other words, if an attack is identified that eliminates the contribution of any of these operations, the overall cryptographic strength of the algorithm will not be compromised.

Using the AES algorithm as an example, the output stage 26 is connected to the intermediate stage 24 for repetitively looping back the diffused signal to the input stage 22 for combination with a next sub-key data block. However, before the diffused signal is looped back, it is passed through a row shift unit 32 (Block 110). A counter 34 (Block 112) counts how many times the repetitive loop has been performed. If another repetitive loop is to be performed, the signal from the row shift unit 32 is provided to a column mix unit 36 (Block 114).

The signal from the column mix unit 36 is fed back to the modulo-two add unit 23 (Block 104), wherein the looped back signal is added with a next sub-key data block. This repetitive looping back continues until the counter 34 reaches a predetermined count. When the predetermined count has been reached, the signal is then passed to another modulo-two add unit 38 so that a final sub-key data block can be added (Block 116) to the output signal. The signal from the modulo-two add unit 38 provides the output signal for the cryptographic device 14 after the repetitively looping back is complete (Block 118).
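The control flow of Blocks 104 through 118 can be sketched as follows. This is only an outline under stated assumptions: the state is a list of byte values, the substitution units are look-up tables, and the diffuser, row shift and column mix are passed in as functions. The name `encrypt_block` and the identity stand-ins in the usage lines are hypothetical placeholders, not the actual units of the device.

```python
from typing import Callable, List, Sequence

State = List[int]
Transform = Callable[[State], State]


def xor_state(a: Sequence[int], b: Sequence[int]) -> State:
    """Modulo-two addition of two equally sized byte sequences."""
    return [x ^ y for x, y in zip(a, b)]


def encrypt_block(x: Sequence[int], sub_keys: Sequence[Sequence[int]],
                  final_key: Sequence[int], sboxes: Sequence[Sequence[int]],
                  diffuse: Transform, row_shift: Transform,
                  column_mix: Transform) -> State:
    state = list(x)
    for count, key in enumerate(sub_keys, start=1):          # counter 34
        state = xor_state(state, key)                        # Block 104
        state = [sboxes[i][v] for i, v in enumerate(state)]  # Blocks 106 a-106 n
        state = diffuse(state)                               # Block 108
        state = row_shift(state)                             # Block 110
        if count < len(sub_keys):                            # Block 112
            state = column_mix(state)                        # Block 114
    return xor_state(state, final_key)                       # Blocks 116, 118


# Usage with identity stand-ins, chosen only so the flow is easy to follow.
identity_tables = [list(range(256)) for _ in range(16)]
same = lambda s: s
y = encrypt_block([0] * 16, [[1] * 16, [2] * 16], [4] * 16,
                  identity_tables, same, same, same)
```

With identity stand-ins the output is simply the XOR of the input with every sub-key, which makes the ordering of the steps easy to check before real units are substituted.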

The modifications providing an enhanced cryptographic strength will now be discussed in greater detail. Instead of a single substitution unit, there are a plurality of substitution units 27 a-27 n. The plurality of substitution units 27 a-27 n substitute data within the plurality of first signals 25 a-25 n which may be based upon random permutations or mathematical formulae.

The substitutions by the plurality of substitution units 27 a-27 n may be followed by a diffuser function. The diffuser function is chosen such that it cannot be specified over GF(2⁸) and runs relatively slowly in software. A range of functions is possible so programmable or customer specific requirements can be satisfied. These customized functions can be retained as proprietary information to prevent proliferation of the resulting algorithm.

The substitution unit in the AES algorithm, for example, is based on a function that provides optimal security against linear and differential cryptanalysis. The issue is that it also allows an over-defined quadratic representation for the algorithm to be specified over GF(2⁸). This is a potential cryptographic vulnerability.

Rather than using the mapping specified in the AES standard, the plurality of substitution units 27 a-27 n are comprised of programmable functions. Since there are 16 substitution units in the illustrated embodiment, the entropy preserving random permutations associated therewith eliminate the possibility of any mathematical model existing over a single field.
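As an illustration of what an entropy preserving random permutation means here, the sketch below builds 16 byte-wide look-up tables, each a permutation of the values 0 to 255, so no two inputs map to the same output. The function name and the seeded generator are assumptions made for reproducibility. Python's `random` module is not a cryptographic generator, and this sketch does not screen the tables against the criteria discussed in the specification.

```python
import random
from typing import List


def make_substitution_units(n_units: int = 16, seed: int = 0) -> List[List[int]]:
    """Return n_units look-up tables, each a random permutation of 0..255."""
    rng = random.Random(seed)  # illustrative only, not cryptographically secure
    units = []
    for _ in range(n_units):
        table = list(range(256))
        rng.shuffle(table)  # a shuffle of 0..255 is always a bijection
        units.append(table)
    return units


units = make_substitution_units()
# Substituting one byte-wide first signal with its own unit:
substituted = units[3][0x5A]
```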

The following criteria should be addressed in providing a plurality of substitution units 27 a-27 n. Meeting these criteria results in a mapping that should be secure against known linear and differential cryptanalysis:

1) each differential characteristic has a probability of at most ¼, and a one-bit input difference will not lead to a one-bit output difference;

2) each linear characteristic has a probability in the range ½±¼, and a linear relation between one single bit in the input and one single bit in the output has a probability in the range ½±⅛; and

3) the nonlinear order of the output bits as a function of the input bits is the maximum, namely 3.

Simulations have shown that of the approximately 10⁵⁰⁰ possible mappings, in excess of 10⁴⁰⁰ of these will meet the above criteria.
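One way to read criterion 1 is as a bound on the difference distribution of a look-up table. The following sketch, under that assumed reading, computes the largest probability with which any non-zero input difference produces a given output difference, and tests whether a one-bit input difference can ever produce a one-bit output difference. The function names are hypothetical, and the table length is assumed to be a power of two.

```python
from typing import Sequence


def max_differential_probability(sbox: Sequence[int]) -> float:
    """Largest fraction of inputs x sharing one output difference for a fixed dx != 0."""
    size = len(sbox)
    worst = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        worst = max(worst, max(counts))
    return worst / size


def one_bit_leads_to_one_bit(sbox: Sequence[int]) -> bool:
    """True if some one-bit input difference yields a one-bit output difference."""
    size = len(sbox)
    single = {1 << i for i in range(size.bit_length() - 1)}
    return any((sbox[x] ^ sbox[x ^ dx]) in single
               for dx in single for x in range(size))


def meets_criterion_1(sbox: Sequence[int]) -> bool:
    return (max_differential_probability(sbox) <= 0.25
            and not one_bit_leads_to_one_bit(sbox))


# The identity mapping is the worst case and fails both parts of the criterion.
identity = list(range(16))
identity_ok = meets_criterion_1(identity)
```

A candidate mapping would be kept only if `meets_criterion_1` returns True and the linear and nonlinear-order criteria, which this sketch does not check, are also satisfied.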

The configuration of the diffuser 30 is governed primarily by the need to add computational complexity to the algorithm so that the speed of a software embedment is impacted. However, rather than use a cryptographically benign function, an operation that provides enhanced mixing to the algorithm is used.

Of all of the possible embodiments for the diffuser 30, three are discussed below with reference to FIGS. 4, 5 and 6. In one embodiment, the diffuser 30 comprises a shift register 60 and a look-up table 62. The minimum number of cycles that the diffuser 30 circulates is enough to provide bit-wise mixing across the entire shift register. The diffuser 30 is entropy preserving, which eliminates the possibility of collisions occurring. The diffuser function is composed of a random mapping 62, a shift register 60 and a modulo-two add unit 64. The input variable is first entered into the shift register 60. The register 60 is then shifted to the right one bit at a time until its contents have been completely re-circulated. With each shift, the least significant bit is modulo-two added to the output of the look-up table 62. The result is moved into the most significant bit position of the register 60. When the contents of the register 60 have been completely processed, an output is generated.

The look-up table 62 is a custom non-linear function that maps at least 6 one-bit inputs to a single one-bit output. Each of the inputs is a tap connected to an individual bit position in the register 60. The tap locations can be arbitrarily chosen with the constraint that no tap can be connected to either the least significant or to the most significant bit positions on the shift register 60.

The look-up table 62 is a uniformly distributed mapping with the all zeros input location mapped to one and the all ones input location mapped to zero. This constraint prevents the diffuser 30 from locking up for certain values of the input. Note also that the processing time or power consumption of the function does not change depending on the structure of the input or output. It therefore preserves the security of the algorithm. Custom mapping designs can also be held as proprietary information to prevent proliferation of the resulting algorithm.
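The shift register diffuser described above can be modelled in a few lines of Python. This is a sketch under assumptions: the register is held as an integer, six taps are used, one full re-circulation is taken to be as many shifts as the register is wide, and the look-up table is a seeded random balanced table with the all zeros entry fixed to one and the all ones entry fixed to zero. The names `make_lookup_table` and `diffuse` and the tap positions are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def make_lookup_table(n_taps: int = 6, seed: int = 0) -> List[int]:
    """Balanced one-bit table: entry 0 maps to 1, the last entry maps to 0."""
    size = 1 << n_taps
    middle = [1] * (size // 2 - 1) + [0] * (size // 2 - 1)
    random.Random(seed).shuffle(middle)  # illustrative, not a secure generator
    return [1] + middle + [0]


def diffuse(value: int, width: int, taps: Sequence[int],
            table: Sequence[int], cycles: Optional[int] = None) -> int:
    """Shift right one bit per cycle, feeding (LSB xor table output) into the MSB."""
    if any(t <= 0 or t >= width - 1 for t in taps):
        raise ValueError("taps may not use the least or most significant bit")
    for _ in range(width if cycles is None else cycles):
        index = 0
        for t in taps:  # gather the tapped bits into a table index
            index = (index << 1) | ((value >> t) & 1)
        feedback = (value & 1) ^ table[index]  # modulo-two add unit
        value = (value >> 1) | (feedback << (width - 1))
    return value


table = make_lookup_table()
taps_128 = (5, 23, 47, 71, 97, 120)  # arbitrary positions, chosen for illustration
mixed = diffuse(0x0123456789ABCDEF0123456789ABCDEF, 128, taps_128, table)
```

Because no tap reads the least significant bit, each shift in this model can be undone from the shifted register alone, which is one way to see why such a structure preserves entropy.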

As examples, variations of the 1×128 diffuser 30 as shown in FIG. 4 will now be discussed with reference to FIGS. 5 and 6. These embodiments 30′, 30″ provide similar statistical mixing performance, but are designed using parallel structures that execute faster in hardware than the 1×128 diffuser 30.

FIG. 5 illustrates a diffuser 30′ constructed using a parallel array of 16 registers 70 a′-70 n′ and 16 look up tables 72 a′-72 n′ (only two of which are shown). Each register 70 a′-70 n′ is 8 bits wide. This design provides mixing over a full 128 bit block size. However, the mixing occurs on 8 bit segments of the input block in parallel. This allows a significant increase in the performance for a hardware embedment without creating a corresponding increase in speed for a software embedment. As with the previous design, each look-up table 72 a′-72 n′ is a uniformly distributed mapping with the all zeros input location mapped to one and the all ones input location mapped to zero.
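A rough software model of the 16×8 arrangement is given below. It assumes each 8 bit register has its own balanced look-up table with six taps on bit positions 1 to 6 (the only positions that avoid both end bits of an 8 bit register) and that the lanes are independent; how the lanes might be interconnected is not stated in the text, so that independence is an assumption. All names are hypothetical. Hardware would run the lanes concurrently, whereas this Python loop visits them one after another.

```python
import random
from typing import List, Sequence

TAPS = (1, 2, 3, 4, 5, 6)  # avoids bit 0 (LSB) and bit 7 (MSB)


def make_table(rng: random.Random) -> List[int]:
    """Balanced 64-entry table with entry 0 -> 1 and entry 63 -> 0."""
    middle = [1] * 31 + [0] * 31
    rng.shuffle(middle)
    return [1] + middle + [0]


def mix_byte(value: int, table: Sequence[int]) -> int:
    """Re-circulate one 8 bit register through eight single-bit shifts."""
    for _ in range(8):
        index = 0
        for t in TAPS:
            index = (index << 1) | ((value >> t) & 1)
        value = (value >> 1) | (((value & 1) ^ table[index]) << 7)
    return value


def parallel_diffuse(block: bytes, tables: Sequence[Sequence[int]]) -> bytes:
    """Apply one register and one table to each byte of the block."""
    return bytes(mix_byte(b, t) for b, t in zip(block, tables))


rng = random.Random(1)  # illustrative seed, not a secure generator
tables = [make_table(rng) for _ in range(16)]
out = parallel_diffuse(bytes(range(16)), tables)
```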

FIG. 6 illustrates a diffuser 30″ constructed using a parallel array of 32 registers 80 a″-80 ff″ and 16 look up tables 82 a″-82 p″. Here, the mixing occurs on 4 bit segments of the input block in parallel and again, each look-up table 82 a″-82 p″ is a uniformly distributed mapping with the all zeros input location mapped to one and the all ones input location mapped to zero.

Yet another aspect of the present invention is directed to a method for converting an input data block X into an output signal Y for a cryptographic device 14. Referring now to the flow diagram 17 in FIG. 7, from the start (Block 140), the method comprises generating a plurality of first signals 25 a-25 n based upon the input data block X and a key data block comprising a plurality of sub-key data blocks at Block 142. Data is substituted within each first signal using a respective substitution unit 27 at Block 144.

The method further comprises mixing data to generate a diffused signal using a diffuser 30 connected to the respective substitution units 27 a-27 n, and repetitively looping back the diffused signal for combination with a next sub-key data block before repeating the substituting and mixing. The looping back is repeated a predetermined number of times, and the method further comprises providing the output signal for the cryptographic device 14 after the repetitively looping back is complete at Block 150. The method ends at Block 152.

Many modifications and other embodiments of the invention will come to the mind of one skilled in the art having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is understood that the invention is not to be limited to the specific embodiments disclosed, and that modifications and embodiments are intended to be included within the scope of the appended claims.

1. A cryptographic device comprising: an input stage receiving an input data block and a key data block comprising a plurality of sub-key data blocks, and generating a plurality of first signals therefrom; an intermediate stage connected to said input stage and comprising a plurality of substitution units, each substituting data within a respective first signal, and a diffuser connected to said plurality of substitution units for mixing data to generate a diffused signal; and an output stage connected to said intermediate stage for repetitively looping back the diffused signal to said input stage for combination with a next sub-key data block.

2. A cryptographic device according to claim 1 wherein the looping back is repeated a predetermined number of times; and wherein said output stage provides an output signal for the cryptographic device after the repetitively looping back is complete.

3. A cryptographic device according to claim 2 wherein the output signal is further combined with a final sub-key data block.

4. A cryptographic device according to claim 1 wherein each substitution unit performs a non-linear substitution based upon at least one look-up table.

5. A cryptographic device according to claim 1 wherein said diffuser comprises a shift register and a look-up table associated therewith.

6. A cryptographic device according to claim 1 wherein said diffuser comprises a plurality of shift registers and a plurality of look-up tables associated therewith.

7. A cryptographic device according to claim 1 wherein said output stage performs a row-shift operation on the diffused output signal before being looped back to said input stage.

8. A cryptographic device according to claim 1 wherein said output stage performs a column-mix operation on the diffused output signal being looped back to said input stage.

9. A cryptographic device according to claim 1 wherein said output stage comprises a counter for counting a number of times the diffused output signal is looped back to said input stage.

10. A communication system comprising: a key scheduler providing a key data block comprising a plurality of sub-key data blocks; and a cryptographic device connected to said key scheduler and comprising an input stage receiving an input data block and the key data block, and generating a plurality of first signals therefrom; an intermediate stage connected to said input stage and comprising a plurality of substitution units, each substituting data within a respective first signal, and a diffuser connected to said plurality of substitution units for mixing data to generate a diffused signal, and an output stage connected to said intermediate stage for repetitively looping back the diffused signal to said input stage for combination with a next sub-key data block, said output stage providing an output signal for the cryptographic device after the repetitively looping back is complete.

11. A communication system according to claim 10 wherein the output signal is further combined with a final sub-key data block.

12. A communication system according to claim 10 wherein each substitution unit performs a non-linear substitution based upon at least one look-up table.

13. A communication system according to claim 10 wherein said diffuser comprises a shift register and a look-up table associated therewith.

14. A communication system according to claim 10 wherein said diffuser comprises a plurality of shift registers and a plurality of look-up tables associated therewith.

15. A communication system according to claim 10 wherein said output stage performs a row-shift operation on the diffused output signal before being looped back to said input stage.

16. A communication system according to claim 10 wherein said output stage performs a column-mix operation on the diffused output signal being looped back to said input stage.

17. A communication system according to claim 10 wherein said output stage comprises a counter for counting a number of times the diffused output signal is looped back to said input stage.

18. A method for converting an input data block into an output signal in a cryptographic device, the method comprising: generating a plurality of first signals based upon the input data block and a key data block comprising a plurality of sub-key data blocks; substituting data within each first signal using a respective substitution unit; mixing data to generate a diffused signal using a diffuser connected to the respective substitution units; and repetitively looping back the diffused signal for combination with a next sub-key data block before repeating the substituting and mixing.

19. A method according to claim 18 wherein the looping back is repeated a predetermined number of times; and further comprising providing an output signal for the cryptographic device after the repetitively looping back is complete.

20. A method according to claim 19 further comprising combining the output signal with a final sub-key data block.

21. A method according to claim 18 wherein each substitution unit performs a non-linear substitution based upon at least one look-up table.

22. A method according to claim 18 wherein the diffuser comprises a shift register and a look-up table associated therewith.

23. A method according to claim 18 wherein the diffuser comprises a plurality of shift registers and a plurality of look-up tables associated therewith.

24. A method according to claim 18 further comprising performing a row-shift operation on the diffused output signal before being looped back.

25. A method according to claim 18 further comprising performing a column-mix operation on the diffused output signal being looped back.

26. A method according to claim 18 further comprising counting a number of times the diffused output signal is looped back.